Score Validation and Theory Elaboration
of a Jungian Personality Measure 



Abstract 

Jungian measures have proven extremely popular, selling more than 
3 million copies per year for use in career and marital counseling, 
as well as in workplace team building and learning styles 
assessments. The present study investigated the construct validity 
of scores from an alternative measure of Jungian personality, the 
Personal Preferences Self-Description Questionnaire (PPSDQ). Forms
of the PPSDQ and the Myers-Briggs measure were completed by 394
college students. A variety of first- and second-order factor
structure models, as well as concurrent validity models, were
evaluated using structural equation modeling (SEM). Factor
invariance across gender was also evaluated using SEM.



The growing recognition that tests are not valid or reliable 
(instead, scores have these properties to varying degrees) 
(Thompson, 1992, 1994) has led to the development of methods to 
establish both validity generalization (Hunter & Schmidt, 1990; 
Schmidt & Hunter, 1977) and reliability generalization
(Vacha-Haase, 1998). The recognition that validity and reliability of
scores vary across test administrations leads naturally to 
explorations of (a) the variability of psychometric coefficients 
and (b) the factors that do and do not explain or predict that 
variability. 

This view means that establishing validity is a dynamic 
process in which we apply theory to data to explore validity, but 
we also simultaneously consider the fit of models to data as 
evidence bearing upon whether and in what ways theory should be 
revised or elaborated. Thus, as viewed by Hendrick and Hendrick
(1986), "theory building and construct measurement are [invariably]
joint bootstrap operations" (p. 393). In a similar vein, Gorsuch
(1983) has noted regarding factor analysis that "A prime use of
factor analysis has been in the development of both the operational
constructs for an area [theory elaboration] and the operational
representatives for the theoretical constructs [score validation]"
(p. 350).

Objectives 

Measures of normal variation in personality grounded in 
Jungian theory have been extremely useful in assessing learning 
styles and in career and other counseling applications. For 
example, the measure developed by mother and daughter Myers and 
Briggs (Myers & McCaulley, 1985) "is the most widely used 
personality instrument, with between 1.5 and 2 million persons 
completing it each year" (Jackson, Parker & Dipboye, 1996, p. 99,
emphasis added). As Yabroff (1990) noted, the measure "brought
Jung's typology to a high level of practical application" (p. 6).
In short, measures of psychological types are among the measures of
personality most frequently used in educational and counseling
applications (Thompson & Ackerman, 1994).

However, notwithstanding its popularity, the Myers and Briggs' 
measure has provoked considerable psychometric controversy. Paired 
articles debating related measurement issues have appeared, for 
example, in an issue of the Journal of Counseling and Development 
(Carlson, 1989; Healy, 1989) and also in an issue of Measurement 
and Evaluation in Counseling and Development (McCaulley, 1991; 
Merenda, 1991).

The measure has been criticized for the use of a forced-choice
or "ipsative" response format, which causes spurious negative
correlations among items (Kerlinger, 1986, p. 463). The measure
has also been criticized for yielding dichotomized types rather than
continuous scores, and for not acknowledging that some people may
have relatively neutral preferences on some dimensions. Therefore,
an alternative measure of type has been developed by Thompson
(1996b): the Personal Preferences Self-Description Questionnaire
(PPSDQ) (cf. Arnau, Thompson & Rosen, in press; Kier, Melancon &
Thompson, 1998; Mittag, in press). As with the Myers and Briggs
measure, the PPSDQ yields scores on four dimensions:
Extroversion-Introversion (EI), Sensation-iNtuition (SN),
Thinking-Feeling (TF), and Judging-Perceiving (JP).

The objectives of the present study were both to explore the 
validity of PPSDQ scores and to further elaborate a model of 
personality invoking the Jungian point of view. Specifically, we 
addressed three research questions: 

1. Do PPSDQ scores delineate the expected four-factor
(Extroversion-Introversion [EI], Sensation-iNtuition [SN],
Thinking-Feeling [TF], and Judging-Perceiving [JP]) Jungian
structure?

2. Are PPSDQ scores free of gender bias, as reflected by 
parameter invariance across gender? 

3. When PPSDQ and Myers-Briggs scores are analyzed jointly
as measuring normal personality, does a single factor
emerge in a second-order hierarchical analysis?

These questions were addressed with structural equation modeling 
techniques (cf. Thompson, in press) with covariance matrices used 
as the bases for the analyses, for the reasons specified by Cudeck
(1989).

Data Source 

Instrumentation 

A form of the PPSDQ (Thompson, 1996b) and the Myers-Briggs
measure (cf. Myers & McCaulley, 1985) were both administered.
The Myers-Briggs form we used includes 95 scored items that are
forced-choice. The PPSDQ version we employed includes 59 items,
which are measured on a seven-point Likert scale. Roughly half the
PPSDQ items measuring each of the four constructs are reversed in
their wording so as to minimize response set influences.

Participants

We collected PPSDQ and Myers-Briggs data from 420 college 
students enrolled in a private university located in the southern 
United States. There were more females (n = 273; 65.0%) than males
(n = 147; 35.0%) in our sample. The mean age of the sample was 23.82
(SD = 9.58). Ethnic groups within the sample included Whites (n = 266;
63.3%), African-Americans (n = 75; 17.9%), and Hispanics (n = 48;
11.4%). This sample was reasonably similar to our various previous
samples (cf. Arnau, Thompson & Rosen, in press; Kier, Melancon & 
Thompson, 1998; Mittag, in press; Thompson & Melancon, 1995), so 
results should be reasonably comparable across our studies. 

We ultimately deleted 22 cases with missing data, and 4
additional cases detected as outliers with regard to data normality.
There were more females (n = 253; 64.2%) than males (n = 141; 35.8%)
in our final sample of 394 participants. The mean age of the final
sample was 24.01 (SD = 9.10).

Analytic Requirements 

Univariate Normality 

Several requirements must be met before maximum likelihood 
theory should be used as a parameter estimation method. First, the 
data should be distributed multivariate normal (Ashcraft, 1998;
Henson, in press). In assessing this property, a necessary but not
sufficient condition is univariate normality. For all items 
considered individually, skewness coefficients ranged from -1.013 
to 1.301, and kurtosis coefficients from -1.079 to 1.506. Thus, the 
data were slightly non-normal. 

We attempted to remedy the problem through the use of item
parcels. Item "parcels" or "testlets" are created by combining
items in one way or another. It has long been recognized that item
data can be combined so as to optimize the normality of data (e.g.,
Cattell, 1956; Cattell & Burdsal, 1975; Gorsuch, 1983, pp. 294-295).
For example, item "testlets" can be created by pairing item
responses with opposite skewness (e.g., create parcel 1 on a scale
by adding the scores on the most negatively skewed item to the
scores on the most positively skewed item within a given scale).

Combining items into "parcels" also results in more 
parsimonious model tests. One feature of this parsimony is that the 
rank of the estimated matrix of associations can be radically 
reduced. For example, if 78 items were the basis of analyses, 
initially 78 variances and 3003 (78 x 77 / 2 = 6006 / 2) unique 
covariances are estimated, and then the parameters to reproduce 
these coefficients are estimated. If the same 78 item responses 
are aggregated only into scores on 36 "doublets," initially only 36 
variances and 630 (36 x 35 / 2 = 1260 / 2) unique covariances are 
estimated, and then the parameters to reproduce these coefficients 
are estimated. 
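
The counts in this example follow from the number of distinct elements in a symmetric matrix of associations: p variances and p(p - 1)/2 unique covariances for p measured variables. A minimal sketch in Python (the helper name is hypothetical and is used only for illustration):

```python
def unique_moments(p):
    """Return (variances, unique covariances) for p measured variables."""
    return p, p * (p - 1) // 2

# The two cases used in the example in the text.
print(unique_moments(78))  # (78, 3003)
print(unique_moments(36))  # (36, 630)
```

The reduction from 3003 to 630 covariances is what makes the parceled analysis more parsimonious.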

The number of model parameters is also itself reduced by this 
process. Fitting more parsimonious models to reproduce fewer 
estimated population values in the matrix of associations leaves 
less room for sampling error to impact the estimation process. 
This in turn theoretically leads to results that better generalize. 

It was decided to aggregate the individual items into parcels 
for two reasons. First, while the data did not depart substantially 
from univariate normality, mild departures can compound in the 
multivariate factor space and result in appreciable multivariate 
non-normality. Second, it has been suggested that one have five 
cases for every freed parameter in a given model (Bollen, 1989). In
testing a model involving 59 items, this requirement is not met 
with a sample size of 394 unless only one parameter per item is 
estimated. In this case, the potential advantage to using item 
parcels is the ability to obtain more valid model tests and 
estimates, given the small-to-moderate sample size and the relative 
non-normality of this data set. The primary disadvantage is the 
loss of interpretability at the item level. 

Two sets of item parcels were constructed. Under the first 
method, items were paired based on the magnitudes and signs of 
their skewness coefficients. Items skewed negatively were matched 
with those skewed positively in an effort to offset the effect of 
skewness on the data. Three to five parcels were created per 
hypothesized dimension (e.g., four EI parcels were created). The 16
parcels yielded a mean skewness coefficient of -.048 (SD = .234),
ranging from -.331 to .556. These values are superior to those 
obtained using individual items and more likely to be distributed 
as multivariate normal. 
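
The pairing idea can be sketched as follows. This is only an illustration under stated assumptions: it pairs items two at a time, uses a simple moment-based skewness coefficient, and runs on made-up toy responses. It is not the exact procedure used to build the 16 parcels, and the function names are hypothetical.

```python
from statistics import mean

def skewness(xs):
    """Moment-based skewness coefficient, m3 / m2**1.5."""
    m = mean(xs)
    m2 = mean([(x - m) ** 2 for x in xs])
    m3 = mean([(x - m) ** 3 for x in xs])
    return m3 / m2 ** 1.5

def pair_by_skewness(items):
    """Pair the most negatively skewed item with the most positively
    skewed item, then the next most extreme pair, and so on.
    `items` maps an item name to its list of responses; with an odd
    number of items the middle one is left unpaired."""
    ordered = sorted(items, key=lambda name: skewness(items[name]))
    parcels = []
    while len(ordered) >= 2:
        low, high = ordered.pop(0), ordered.pop(-1)
        scores = [a + b for a, b in zip(items[low], items[high])]
        parcels.append(((low, high), scores))
    return parcels

# Toy data (hypothetical): "a" is positively skewed, "b" negatively skewed.
toy = {"a": [1, 1, 1, 2, 7], "b": [1, 6, 7, 7, 7], "c": [3, 4, 4, 4, 5]}
for names, scores in pair_by_skewness(toy):
    print(names, scores, skewness(scores))
```

In the toy data the two skewed items offset one another, so the summed parcel is symmetric even though neither item is.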

The second method we used to obtain parcels entailed
exploratory factor analysis of the items hypothesized to saturate
each construct. An identical number of parcels (16) was
constructed using this method. These parcels yielded an average
skewness coefficient of -.108 (SD = .356), ranging from -.813 to
.520.

Multivariate Normality 

To assess the sufficient condition (multivariate normality),
after ordering cases by the Mahalanobis distance of each case's set
of scores from the variable centroids, the distance for each
participant was plotted against the expected chi-square value
associated with the individual's position in the distribution of
distance scores. The computer program MULTINOR (Thompson, 1990) was
used to produce the graph. This procedure was employed in addition
to statistical significance tests, due to the inherent limitations
of statistical tests (Schmidt & Hunter, 1997; Thompson, 1996a,
1999).

Using the present graphical method, one can identify not only 
the individuals contributing to non-normality, but also obtain a 
relative index of degree of normality. Four outliers, identified as 
such by the multivariate plot, were removed. These four 
participants were classified as extreme when using either the 
factor analytic or skewness parcels. For the item parcels, perfect 
multivariate normality could not be assumed since all coordinates 
did not fall along a straight line, as reported in Figure 1; 
however, the departure did not appear to be extreme. 



INSERT FIGURE 1 ABOUT HERE. 

Mardia's coefficient of multivariate kurtosis was also
computed for the data. For the individual items (i.e., before
parceling), Mardia's coefficient equalled 535.511 (critical ratio
= 61.929). These values indicate a large degree of non-normality in
the distribution. For the factor analytic item parcels, Mardia's
coefficient equalled 61.721 (c.r. = 25.524); for the skewness
parcels, Mardia's coefficient was 44.208 (c.r. = 18.282). Though
the parcel scores were still non-normal, the degree of non-normality
diminished substantially from the original items to the factor
analytic parcels to the skewness parcels. Thus, the skewness
parcels were used for the primary analyses in the present study.

Because the data were to be partitioned by sex to evaluate the 
invariance of estimates across gender, each group's distribution 
was tested for multivariate normality as well, as reported in 
Figure 2. The males' parcel scores were more non-normal (Mardia's 
coefficient = 49.704, c.r. = 12.298) than were the females' scores 
(Mardia's coefficient = 31.990, c.r. = 10.601). 



INSERT FIGURE 2 ABOUT HERE. 



Sample Size 

A second requirement for ML estimation is a large sample size
(Thompson, in press). The parameter estimates and fit indices are
only assumed to be valid asymptotically. As explained earlier, item
parcels were created to meet the suggested
five-cases-per-freed-parameter guideline (Bollen, 1989). This
requirement was met for all models involving item parcels, given
the present sample of 394 valid cases. The largest number of freed
parameters to be estimated was 76 for the model testing the
invariance of factor structure across gender. To meet Bollen's
criterion for this model, 380 cases would be required to estimate
76 parameters. However, for models involving individual items, well
over 100 parameters were estimated (126 for the correlated factors
model). Consequently,
results from analyses at the item level should probably not be 
interpreted. Similarly, results from the equal covariances model 
(137 parameters estimated) should be viewed skeptically as well. 
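
The arithmetic behind this guideline is simple enough to check directly. The sketch below applies the five-cases-per-freed-parameter rule to the parameter counts quoted above; the function names and model labels are illustrative only.

```python
def min_cases(free_params, cases_per_param=5):
    """Smallest sample size satisfying the cases-per-parameter rule."""
    return free_params * cases_per_param

def max_params(n_cases, cases_per_param=5):
    """Largest number of freed parameters a sample can support."""
    return n_cases // cases_per_param

N = 394  # final sample size reported in the text
models = [
    ("invariance across gender (parcels)", 76),
    ("correlated factors (items)", 126),
    ("equal covariances", 137),
]
for label, q in models:
    needed = min_cases(q)
    print(f"{label}: {q} parameters need {needed} cases; met = {needed <= N}")

print("parameters supportable with N = 394:", max_params(N))
```

Only the 76-parameter model falls within the limit implied by 394 cases.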

Methodological Issues 

Computer Software 

AMOS 3.6 was used for all analyses. Results obtained from AMOS 
should mirror those obtained using LISREL or EQS. According to Cox 
(1995), almost all estimates will be identical through two decimal
places.

Scaling of Latent Factors 

For the initial analyses, the scales of latent factors were 
set by constraining the factor variances to unity. When testing 
higher-order factors and the invariance of parameters across 
groups, the scale was set using indicators. To choose which 
measured variables would have their paths to constructs set to 
unity for model identification purposes, alpha-if-deleted 
statistics were computed for the complete set of item parcels for 
a given scale. For each of the four Jungian constructs (e.g., EI), the
item parcel for which alpha deteriorated the most if the parcel was 
not used in computing score reliability for the complete parcel set 
was selected to scale the latent factors, because scores on this 
parcel appeared to contribute the most to construct reliability 
(Byrne, 1989, 1994). 
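
The selection rule can be sketched with coefficient alpha computed from parcel scores. The code below is an illustration only: it uses the usual variance-based formula for alpha, made-up toy scores, and hypothetical function names; it is not output from the present data.

```python
from statistics import variance

def cronbach_alpha(parcels):
    """Coefficient alpha; `parcels` is a list of equal-length score lists."""
    k = len(parcels)
    totals = [sum(row) for row in zip(*parcels)]
    item_var = sum(variance(p) for p in parcels)
    return k / (k - 1) * (1 - item_var / variance(totals))

def alpha_if_deleted(parcels):
    """Alpha recomputed with each parcel left out in turn."""
    return [cronbach_alpha(parcels[:i] + parcels[i + 1:])
            for i in range(len(parcels))]

def marker_parcel(parcels):
    """Index of the parcel whose removal lowers alpha the most."""
    alphas = alpha_if_deleted(parcels)
    return min(range(len(parcels)), key=lambda i: alphas[i])

# Toy scores for three parcels on one scale (hypothetical values).
toy = [[1, 2, 3, 4, 5], [1, 3, 3, 4, 5], [3, 1, 4, 2, 5]]
print(alpha_if_deleted(toy))
print(marker_parcel(toy))
```

In the toy data, dropping the first parcel produces the lowest alpha, so that parcel would be chosen to scale the factor.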

Estimation Method 

Maximum likelihood (ML) estimation maximizes the fit between
the estimated population variance/covariance matrix and that
implied by the model. While generalized least squares (GLS)
solutions were also obtained for most models, we interpreted the ML
estimates. Although some suggest using GLS for estimating
structural models, in simulation studies Chou and Bentler (1995)
found that ML estimates reject true parameter values more
consistently than either GLS or ADF (asymptotic distribution free)
methods (see their Table 3.5 on p. 53). The authors stated, "All
the fit indices obtained from ML performed much better than those
obtained from GLS and ADF and should be preferred indicators" (p.
94). It seems that ML is superior to the other two theories, at
least when the data are reasonably multivariate normal.

Measures of Model Fit 

Following the recommendations of various methodologists (cf.
Fan, Thompson & Wang, 1999; Hoyle & Panter, 1995; Hu & Bentler,
1995, 1999), the chi-square statistic and p-value; the GFI
absolute fit index; and the TLI (NNFI), IFI, and CFI relative fit
indices were all reported. Additionally, the chi-square/df ratio
along with the RMR, RMSEA, and AIC absolute fit indices were
included. However, there are problems with such indices. McDonald
(1997) cautioned:

I do not believe that we presently know how to use 
such indices and in particular I do not believe on 
current evidence that global indices of 
approximation can or should be used as the sole 
basis for a decision that a restrictive model is 
acceptable. (p. 217)

The chi-square statistics were chosen due to the extreme 
popularity of such measures in SEM, along with their utility and 
validity, when interpreted properly. While statistical significance 
tests usually assume a "nil null" (Cohen, 1994), nested models do
not. The change in chi-square actually reflects the difference
between two plausible models. However, since the probability of the
test statistic is still largely affected by sample size (cf.
Thompson, 1996a, 1999), other indices are needed as well.

The GFI indexes the relative amount of the observed
variance/covariance matrix accounted for by the implied model, and
is analogous to an R² statistic. Hu and Bentler (1995) noted that
"Marsh et al. (1988) found that GFI appeared to perform better than 
any other absolute index (e.g., AGFI, CAK, CN, RMR, etc.)" (p. 91). 

The root mean square residual (RMR) is the average of the 
fitted residuals obtained from subtracting the implied model 
variance/covariance matrix from the observed. Hu and Bentler (1995) 
suggested always including the standardized RMR. Because AMOS does
not report the standardized RMR, interpretation is more difficult.
In general, lower values are to be preferred, with 0 indicating a 
perfect fit to the sample data. 
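
As an illustration of the computation, the sketch below takes the root of the mean squared residual over the unique elements (lower triangle plus diagonal) of the two matrices. This is one common definition of the unstandardized RMR and is assumed here; the matrices are made-up values, not the present data.

```python
import math

def rmr(observed, implied):
    """Root mean square residual over the unique elements of two
    covariance matrices given as nested lists."""
    p = len(observed)
    squared = [(observed[i][j] - implied[i][j]) ** 2
               for i in range(p) for j in range(i + 1)]
    return math.sqrt(sum(squared) / len(squared))

# Hypothetical 2 x 2 example: only the covariance is misfit, by .20.
S = [[1.0, 0.5], [0.5, 1.0]]
Sigma = [[1.0, 0.3], [0.3, 1.0]]
print(rmr(S, Sigma))
print(rmr(S, S))  # 0.0 for a perfectly reproduced matrix
```

Because the residuals are in the metric of the measured variables, the unstandardized value depends on the variables' scales.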

The root mean square error of approximation (RMSEA) measures 
the lack of fit per degree of freedom. It indicates the potential 
fit of the model to the population parameters. Values below .05 are 
considered to be a "close fit", and below .08 "reasonable" (Browne 
& Cudeck, 1993, p. 144). This measure was included due to the 
"strong urgings" of MacCallum to include such an index which 
penalizes for model complexity (1995, p. 30) . 
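
For readers who wish to reproduce the index from a reported chi-square, a minimal sketch follows. It assumes the common point-estimate form sqrt(max(chi-square - df, 0) / (df(N - 1))); some programs divide by N rather than N - 1, so small discrepancies from software output are possible. The function names are hypothetical.

```python
import math

def rmsea(chi2, df, n):
    """RMSEA point estimate; assumes the (N - 1) form of the formula."""
    return math.sqrt(max(chi2 - df, 0.0) / (df * (n - 1)))

def chi2_df_ratio(chi2, df):
    """Chi-square divided by its degrees of freedom."""
    return chi2 / df

# Values reported later in this paper for the one factor parcel model.
value = rmsea(1362.44, 104, 394)
print(round(value, 3), round(chi2_df_ratio(1362.44, 104), 2))
```

A chi-square no larger than its degrees of freedom yields an RMSEA of zero under this formula.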

The Akaike Information Criterion (AIC) was primarily reported 
to compare non-nested models. This absolute index also penalizes 
for increasing the number of parameters being estimated. 

Type-2 relative fit indices are useful for comparing models 
but do not measure explained variance. The Tucker-Lewis Index (TLI) 
denotes the relative improvement in fit per degree of freedom for 
a given model compared with a baseline model. The incremental fit 
index (IFI) is similar but is more consistent across estimators. 
The only Type-3 relative index included here was the comparative 
fit index (CFI). It first replaces the central chi-square with a
noncentral chi-square and then measures the relative reduction in 
lack of fit. These Type-2 and -3 indices make use of more 
information, but the assumed distributions (e.g., the noncentral 
chi-square) may be wrong (Hu & Bentler, 1995) . 
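
The three relative indices can be written compactly in terms of the chi-square and degrees of freedom of the target model (m) and the baseline model (b). The sketch below uses the standard textbook formulas and made-up input values; it is offered as an aid to interpretation, not as the AMOS implementation.

```python
def tli(chi2_m, df_m, chi2_b, df_b):
    """Tucker-Lewis index (NNFI): improvement in chi-square/df ratio."""
    ratio_b, ratio_m = chi2_b / df_b, chi2_m / df_m
    return (ratio_b - ratio_m) / (ratio_b - 1.0)

def ifi(chi2_m, df_m, chi2_b):
    """Incremental fit index."""
    return (chi2_b - chi2_m) / (chi2_b - df_m)

def cfi(chi2_m, df_m, chi2_b, df_b):
    """Comparative fit index, based on noncentrality (chi-square - df)."""
    d_m = max(chi2_m - df_m, 0.0)
    d_b = max(chi2_b - df_b, d_m, 0.0)
    return 1.0 if d_b == 0 else 1.0 - d_m / d_b

# Hypothetical values: baseline chi-square 2000 (df 120), model 300 (df 98).
print(round(tli(300, 98, 2000, 120), 3))
print(round(ifi(300, 98, 2000), 3))
print(round(cfi(300, 98, 2000, 120), 3))
```

All three indices equal 1 when the model chi-square equals its degrees of freedom, which is why values are read as closeness to 1.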

In general, accepting models based only on obtaining "the
'magic' .90 level" (Judd, Jessor, & Donovan, 1986) was avoided. Hu
and Bentler (1995) stated that such a standard is "clearly an
inadequate rule" and that "we are hardly able to point to a
condition for which it yields appropriate results" (p. 95).

Measurement Models Using All Items 
Uncorrelated Factors 

As mentioned earlier, due to the relatively small sample size 
present, one should not interpret parameter estimates or fit 
indices at the item level. The models reported in this section are 
included only for the sake of completeness and possible use in 
future research. As shown in Figure 3, the 59 items were 
hypothesized to reflect four underlying factors. While all 
solutions were proper, the uncorrelated factors model fit the data 
extremely poorly, χ² = 4297.98, p < .001, GFI = .71, as reported in
Table 1. 



INSERT FIGURE 3 AND TABLE 1 ABOUT HERE. 

The primary reason for the large lack of fit probably stems 
from the number of degrees of freedom in the model, 1710, relative 
to the number of parameters estimated, 120. The data do suggest an 
underlying factor structure, as reflected in the comparison with an 
independence model specifying no covariances among variables, 
change χ² = 4846.93, p < .001. This relatively uninteresting finding
simply validates our treatment of the variables as being related in 
some way. 

Correlated Factors 

Allowing the four latent variables to correlate increased the
fit marginally, change χ² = 354.24, p < .001, change GFI = .02, as
reported in Table 1. None of the comparative fit indices is large,
nor is the average RMR value small, as would be hoped. But given
the large df in this model as well, one would need to take the
small number of parameters being estimated into account. The
χ²/df ratio was 2.31. This ratio suggests a fair fit to the data
(though one would have to reject this model based on all other
indices).

All of the critical ratios for regression weights were larger 
than 2.0, and most were above 4.0, suggesting that the items are 
important in measuring each construct. The EI construct was
negatively correlated with the other three latent variables, while
SI and JP yielded an r of .801. Further, variance-accounted-for
indices indicated that some items were not being measured well by
the underlying factors (r² for EI31 as low as .032). Future
assessments of the measurement model underlying the individual 
PPSDQ items should be carried out with a much larger sample size. 
More complex analyses at the item level are not discussed here due 
to the sample size limitation. 

Measurement Models Using Item Parcels
Factor Analytic Parcels 

Almost all fit indices and fit statistics reflected a slightly 
poorer fit in the factor analytic parcels when compared with the 
skewness parcels, as reported in Tables 2 and 3. This was probably 
due to the more severe non-normality of the factor analytic 
parcels, described previously. As noted above, Chou and Bentler 
(1995) found that when the data are multivariate normal ML 
outperformed GLS and ADF methods not only in estimating parameters 
but also in providing more accurate fit statistics. Because the 
factor analytic parcels were less normally distributed, one would 
expect these fit statistics to be somewhat more inaccurate 
(although one cannot know whether the reduced fit was due to
non-normality or to an actual poorer fit with these testlets).
Consequently, here we interpret primarily the results obtained 
using the skewness parcels. 

INSERT TABLES 2 AND 3 ABOUT HERE. 

Uncorrelated Factors 

The uncorrelated factors model did not result in an acceptable
fit for the skewness parcels, χ² = 625.35, p < .001, χ²/df = 6.01,
GFI = .83, as reported in Table 3 and Figure 4. [Note that the
values in the graph beside the observed variables are NOT the error
variances, but are the percentages of variance explained by
the model for each observed variable.] All squared multiple
correlations were above .40 except for four parcels, three of which
were in the TF factor. These results, coupled with a modification
index of 159.665 for JP and SI, indicated a potential cross-loading
problem.



INSERT FIGURE 4 ABOUT HERE. 



Correlated Factors 

The four correlated factors model (see Figure 5) was the first
model to fit the sample data well. Though the test statistic was
statistically significant, χ² = 302.66, p < .001, the absolute fit
index GFI indicated that 91% of the variability within the
variance/covariance matrix was being accounted for by the implied
model. All three relative fit indexes were above .91. Further, the
RMSEA was below .08, indicating a potentially "acceptable" fit to
the data.



INSERT FIGURE 5 ABOUT HERE. 

From inspecting the standardized residual covariances, it
appeared that one of the SI parcels was correlated with three of
the TF parcels. Cross-loading the parcel onto the TF factor
increased the fit marginally (GFI increased from .91 to .917;
chi-square change ≈ 20), but the added complexity in interpretation and
calculation of scale scores would seem to argue against this slight
and atheoretical increment in fit. Given that this was one of the
two models theorized to underlie the population data (the other
including a higher-order factor), these results indicated a
relatively good fit between the theory and the data.

Other Potential Models 

A more restrictive model (df = 104) is the one factor model.
Here, one general factor is assumed to underlie the covariances
among variables instead of four. The fit of this model, presented
in Figure 6, was poor. As reported in Table 3, χ² = 1362.44, p <
.001, GFI = .64, and the model was abandoned as a viable
alternative to the four factor hypothesis.

INSERT FIGURE 6 ABOUT HERE.

For the Myers-Briggs data only, a just-identified model was
estimated specifying four factors. This model (not portrayed here)
yielded a correlation coefficient between SI and JP of .492. The
correlated factors model using the PPSDQ data resulted in an even
higher correlation between the two factors (r = .80). Because of
these high correlations, a possible three-factor PPSDQ model was
hypothesized such that SI and JP items saturated a common single
factor, as reported in Figure 7. This model fit the data reasonably
well (GFI = .89, CFI = .90), but not as well as a four factor
solution (change χ² = 65.02, p < .001). Even in the three-factor
solution, EI continued to correlate negatively with every other
factor in the analysis (r = -.30 with SiJp; r = -.28 with TF).

INSERT FIGURE 7 ABOUT HERE. 

Models Nested Within the Four Factor Solution 

Since four correlated factors fit the data better than one, 
three correlated, or four uncorrelated factors, this model was 
chosen as best. But it still remained to be seen whether there are 
higher-order factors that underlie the first-order latent 
variables. Since the fits of first- and second-order factor analyses
are very similar, specifying a higher-order model versus
correlated first-order factors should be based primarily on theory
(Byrne, 1994, p. 118). Theory suggests that a global personality 
construct should underlie the psychological types. 

A less restrictive model involving two higher-order factors 
was first tested, as reported in Figure 8. This model fit the data 
adequately, GFI = .91, RMSEA = .07. 

INSERT FIGURE 8 ABOUT HERE. 

In comparing the Figure 8 two higher-order factors solution to
the one higher-order factor model, as reported in Figure 9,
results again indicated no substantial differences between the
models (GFI = .91 for both, CFI = .92 for both), in spite of the
statistically significant test statistic, change χ² = 5.58, p =
.018. However, the percentage of variance explained differed across
solutions (cf. Figures 8 and 9). The two factor solution increased
the EI r² from .11 to .17, and boosted the TF r² by 20%. This probably
occurred due to the large correlation reported earlier (r = .80)
between SI and JP. The addition of another factor freed the EI and
TF factors to load elsewhere. [Figure 10 presents a similar
analysis in which all 4 paths from the first-order factors to the
second-order factor were constrained to equal unity.]

INSERT FIGURES 9 AND 10 ABOUT HERE. 

For this reason, one might argue for a two higher-order
factors solution. But one could also rationally maintain a one
higher-order factor solution based on parsimony and the absence of
any substantive changes in fit indices when comparing the two
models. The more parsimonious option was selected here for the
reasons just stated and given the non-statistically significant
results obtained when a GLS solution was considered (change χ² =
2.059, p = .151). In our opinion, the relatively slight evidence
favoring a two factor model does not outweigh the more parsimonious,
theoretically meaningful one higher-order factor model.
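
The two nested-model comparisons above can be checked with a short calculation. The sketch assumes the two models differ by a single degree of freedom (an assumption on our part, consistent with the reported probabilities), in which case the upper-tail chi-square probability reduces to a complementary error function. The function name is hypothetical.

```python
import math

def p_value_df1(delta_chi2):
    """Upper-tail probability of a chi-square variate with 1 df."""
    return math.erfc(math.sqrt(delta_chi2 / 2.0))

# Change in chi-square between the two and one higher-order factor models.
print(round(p_value_df1(5.58), 3))   # ML estimates  -> 0.018
print(round(p_value_df1(2.059), 3))  # GLS estimates -> 0.151
```

The same difference in model specification is thus statistically significant under ML but not under GLS, which is part of the reason the evidence for the second higher-order factor is described as slight.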

The remainder of the analyses in this paper address results 
for the two models that appear to be the most reasonable (i.e., the 
four correlated first-order factors model and the one higher-order 
factor model). At this point one would usually assess the indirect
effects of the higher-order factor on each observed variable. But 
given the aggregated data used here (i.e., item parcels), such 
findings would seem to convey little information. 

Between Group Comparisons 
Test of Equal Covariances 

The first between group comparison we assessed was the
comparability of variance/covariance matrices across gender.
Results indicated relative equality between female and male
covariance matrices, χ² = 161.23, p = .061, GFI = .95, RMSEA = .02,
as reported in Table 4. These findings suggest that models
constraining parameters for both groups to be invariant may be
viable.



INSERT TABLE 4 ABOUT HERE. 

Tests Assuming Four Correlated First-Order Factors 

Table 4 presents the estimates of the similarity of
measurement models for females and males. This model, constraining
parcels to saturate identical factors across groups, fit the data
reasonably well, controlling for the many degrees of freedom,
χ²/df = 2.00, GFI = .89, RMSEA = .05, CFI = .92. The large chi-square
value (391.92, p < .001) was obtained by adding the chi-square for
females (χ² = 232.02) to that for males (χ² = 159.84), as reported
in Table 5.



INSERT TABLE 5 ABOUT HERE. 

In this case, the overall chi-square was influenced more by 
the lack of fit in the females model (59% of total chi-square) than 
the males (41%). However, the divergence in percentages is not
surprising, given that female sample size was almost twice that of 
males, as sample size does itself inflate statistical significance 
tests, even holding model fit completely constant. 

Second, we tested the equality of factor pattern (lambda)
coefficients for both groups on the four factor solution. The fit
of this model was comparable to one in which the pattern
coefficients could vary across groups (change χ² = 18.48, p = .102).
Next, the error terms associated with each manifest variable were
constrained to be equal across gender. Again the differences were
not statistically significant (change χ² = 19.39, p = .249).
Finally, the correlations among the four latent variables were
constrained to be equal across gender, in addition to the previous
constraints imposed. Once more this model fit the data roughly as
well as one allowing all three sets of parameters to vary across
groups (change χ² = 11.78, p = .067).
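
The sequence of invariance tests can be sketched as a series of chi-square difference tests. The degrees of freedom used below (12 pattern coefficients, 16 error variances, 6 factor correlations) are our assumptions based on 16 parcels and four factors with one marker parcel per factor; they are not stated in the text, although they reproduce the reported probabilities. For even degrees of freedom the upper-tail probability has a closed form, which avoids any dependence on a statistics library.

```python
import math

def chi2_sf_even(x, df):
    """Upper-tail chi-square probability for positive even df:
    exp(-x/2) * sum_{k < df/2} (x/2)**k / k!."""
    if df <= 0 or df % 2:
        raise ValueError("df must be a positive even integer")
    half = x / 2.0
    term, total = 1.0, 1.0
    for k in range(1, df // 2):
        term *= half / k
        total += term
    return math.exp(-half) * total

# (constraint set, change in chi-square, assumed change in df)
steps = [
    ("pattern coefficients", 18.48, 12),
    ("error variances", 19.39, 16),
    ("factor correlations", 11.78, 6),
]
for label, delta, ddf in steps:
    print(label, round(chi2_sf_even(delta, ddf), 3))
```

None of the three probabilities falls below .05, which is the basis for treating each added set of equality constraints as tenable.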

These persuasive results indicate that the factor structure
(assuming four first-order variables) for this data set is similar
for both genders. Final models for each group are presented in
Figures 11 and 12.

INSERT FIGURES 11 AND 12 ABOUT HERE. 

Tests Assuming One Second-Order Factor 

Because we selected the higher-order factor model as best
fitting and most parsimonious, we next evaluated the invariance of 
female and male estimates for this model. We ultimately tested 
whether the measurement model underlying both groups contained a 
single higher-order factor. 

First, we compared the similarity of measurement models across
groups, assuming one higher-order factor. Once degrees of freedom
were accounted for, this model fit well, χ²/df = 2.02, GFI = .89,
RMSEA = .05, CFI = .92. As in previous comparisons, the overall
chi-square was influenced more by lack of fit in the females
distribution (χ² = 241.01, representing 60% of variation in the
overall chi-square) than in the males (χ² = 162.28, representing 40%
of total), as reported in Table 5.

Constraining pattern coefficients to be identical across 
groups did not result in a poorer fitting model (change χ² = 16.93, 
p = .152; most fit indices were identical). Assuming equal error 
variances was tenable (change χ² = 19.13, p = .262), as was assuming 
equal pattern coefficients for the first-order factors on the 
higher-order factor (change χ² = 9.46, p = .051). Again, these 
results indicate similar factor structure across groups. 

Testing the Assumption of a Higher-Order Factor 

The final models estimated for each group are depicted in 
Figures 13 and 14. One reason for the slightly poorer fit for the 
females involved the EI items. Less factor variance was explained 
for the females (8%) than for the males (11%). This can also be 
seen in the smaller pattern coefficient (-.29 versus -.33). 
Further, the TF factor was weighted more strongly for the females 
(.56) than for the males (.50), suggesting that TF, SI, and JP 
tended to "clump together" in the female data, while excluding EI. 
For the males, all four factors were more equally related. 
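
The link between the two sets of numbers in this paragraph is that, in a standardized solution, the proportion of a first-order factor's variance explained by the higher-order factor is the square of its standardized pattern coefficient. A brief check using the coefficients quoted above (the function name is illustrative only):

```python
def variance_explained(standardized_coefficient):
    """Proportion of variance explained by a single standardized path."""
    return standardized_coefficient ** 2


# Standardized coefficients quoted in the text for the first factor.
coefficients = {"females": -0.29, "males": -0.33}
explained = {g: variance_explained(c) for g, c in coefficients.items()}
for group, share in explained.items():
    print(f"{group}: {share:.0%}")  # females: 8%, males: 11%
```
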

INSERT FIGURES 13 AND 14 ABOUT HERE. 

Between Measure Comparisons 

Having found two models that fit the data adequately for both 
groups of participants (i.e., four correlated first-order factors 
or one higher-order factor), we next compared PPSDQ results with 
scores on the Myers-Briggs. Initially, both variables for each 
Myers-Briggs factor (e.g., Extroversion score and Introversion 
score) were entered into an analysis with PPSDQ item parcels. This 
resulted in the Myers-Briggs items explaining most of the variance 
in the model because the two Myers-Briggs scale scores for each 
factor are obtained by summing items purported to measure the same 
construct but scaled in opposite directions (e.g., all items 
measuring Extroversion are summed, as are all items measuring 
Introversion). Their joint inclusion in a factor analysis distorts 
results because of the strong linear relationship between the two 
scale scores within each factor. Thus, here a Myers-Briggs 
composite was created by averaging the two scores on each type 
(e.g., Extroversion scores were reverse coded and then averaged 
with Introversion scores). 
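
The composite described above can be sketched as follows. The 0 to 30 score range, the respondent values, and the function names are hypothetical, introduced only for illustration; the actual Myers-Briggs scale ranges are not given in this passage.

```python
def reverse_code(score, scale_min, scale_max):
    """Flip a score so that high values become low and vice versa."""
    return scale_max + scale_min - score


def bipolar_composite(extroversion, introversion, scale_min, scale_max):
    """Average reverse-coded Extroversion with Introversion.

    Higher composite values indicate a stronger Introversion preference.
    """
    reversed_e = reverse_code(extroversion, scale_min, scale_max)
    return (reversed_e + introversion) / 2.0


# Hypothetical (Extroversion, Introversion) pairs on an ASSUMED 0-30 range.
respondents = [(24, 5), (10, 19), (15, 15)]
composites = [bipolar_composite(e, i, 0, 30) for e, i in respondents]
print(composites)  # [5.5, 19.5, 15.0]
```

Collapsing each pair into one indicator removes the near-redundant second score, so a single Myers-Briggs variable per factor enters the model alongside the PPSDQ item parcels.
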

Figure 15 depicts the four correlated factors model for the 
two combined measures. [Note that each Myers-Briggs composite is 
the far right indicator under each factor.] This model fit the data 
fairly well, GFI = .89, RMSEA = .07, CFI = .92, as reported in 
Table 6. However, the test statistic was large, χ² = 482.27, p < 
.001. Also, the χ²/df ratio (2.94) was not as small as desired. 
On inspecting the factor pattern coefficients for the item parcels 
and Myers-Briggs variables, the two were generally comparable. Most 
r² values were above .5, indicating an adequate percentage of 
variance explained for each observed variable (Byrne, 1989). 
Exceptions were several of the TF item parcels, as was discovered 
in earlier analyses. 



INSERT FIGURE 15 AND TABLE 6 ABOUT HERE. 

A higher-order factor model, reported in Figure 16, fit the 
data as well as the four correlated factors model (change χ² = 4.62, 
p = .099). Again the higher-order factor was dominated by SI 
(λ = .95) and JP (λ = .82). In general, both PPSDQ and Myers-Briggs 
scores appeared to be measuring the same constructs. Of course, the 
measures were not perfectly correlated, but perfect correlation 
would not be desired if the PPSDQ were hypothesized to be an 
improvement over the older Jungian measure. 
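
The degrees of freedom for this comparison are not stated in the passage. One way to see where they could come from, assuming the usual parameterization in which a single higher-order factor replaces the free correlations among the first-order factors with one loading per factor, is sketched below; with 2 degrees of freedom the chi-square upper-tail probability reduces to exp(-x/2), which is consistent with the reported p value.

```python
import math


def df_gain_from_higher_order(n_first_order_factors):
    """Factor correlations removed minus second-order loadings added."""
    n_correlations = n_first_order_factors * (n_first_order_factors - 1) // 2
    n_second_order_loadings = n_first_order_factors
    return n_correlations - n_second_order_loadings


# ASSUMED parameterization: 6 correlations are replaced by 4 loadings.
df_change = df_gain_from_higher_order(4)

# With 2 df, the chi-square survival function is exactly exp(-x / 2).
p_value = math.exp(-4.62 / 2)
print(df_change, round(p_value, 3))
```
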



INSERT FIGURE 16 ABOUT HERE. 



Conclusions 

Several conclusions can be drawn from these results. First, 
the PPSDQ was found to adequately measure four underlying 
constructs, as hypothesized. Second, there is evidence suggesting 
that a higher-order factor might underlie the four dimensions. This 
construct could be considered a general personality factor. Third, 
interpretive results can be generalized to both females and males; 
there were no gender moderating effects present in this data set. 
Finally, while not measuring the constructs in exactly the same 
manner, the PPSDQ yields results comparable to those obtained from 
the Myers-Briggs measure. 

Measures of Jungian type are among the most commonly used 
measures of normal personality variations across diverse 
applications, including learning styles assessment and guidance 
counseling. However, the Myers-Briggs measure has been criticized 
on the various psychometric grounds summarized previously. The 
present results together with prior results (cf. Arnau, Thompson & 
Rosen, in press; Kier, Melancon & Thompson, 1998; Mittag, in press) 
suggest that the PPSDQ may be used for the same important purposes, 
while at the same time avoiding various pitfalls associated with 
the alternative measure. 

Figure Captions 

Figure 1. Assessment of the multivariate normality of item parcels 
for the entire population. [Plot titled "Multivariate Normality 
(All)"; axis: Mahalanobis' distance.] 

Figure 2. Assessment of the multivariate normality of item parcels 
for females and males. [Plots titled "Multivariate Normality 
(Females)" and "Multivariate Normality (Males)"; axis: Mahalanobis' 
distance.] 

Figure 3. Model depicting four uncorrelated factors (all items 
used). 

Figure 4. Standardized solution for model with four uncorrelated 
factors (item parcels used). 
Note. The values in the graph beside the observed variables are not 
the error variances, but here are the percentages of variance 
explained by the model for each observed variable. 

Figure 5. Standardized solution for model with four correlated 
factors. 

Figure 6. Standardized solution for model with one general factor. 

Figure 7. Standardized solution for model with three correlated 
factors. 

Figure 8. Standardized solution for model with two higher-order 
correlated factors. 

Figure 9. Standardized solution for model with one higher-order 
factor. 

Figure 10. Standardized solution for model with one higher-order 
factor, first-order to second-order factor paths constrained to be 
unity. 

Figure 11. Standardized solution for model with four correlated 
factors, female data only. 

Figure 12. Standardized solution for model with four correlated 
factors, male data only. 

Figure 13. Standardized solution for model with one higher-order 
factor, female data only. 

Figure 14. Standardized solution for model with one higher-order 
factor, male data only. 

Figure 15. Standardized solution for model with four correlated 
factors, PPSDQ and MBTI data combined. 
Note. Each MBTI composite is the far right indicator under each 
factor. 

Figure 16. Standardized solution for model with one higher-order 
factor, PPSDQ and MBTI data combined. 